Interaction dynamics of two reinforcement learners

Author

  • Walter J. Gutjahr
Abstract

The paper investigates a stochastic model where two agents (persons, companies, institutions, states, software agents or other) learn interactive behavior in a series of alternating moves. Each agent is assumed to perform “stimulus–response–consequence” learning, as studied in psychology. In the presented model, the response of one agent to the other agent’s move is both the stimulus for the other agent’s next move and part of the consequence for the other agent’s previous move. After deriving general properties of the model, especially concerning convergence to limit cycles, we concentrate on an asymptotic case where the learning rate tends to zero (“slow learning”). In this case, the dynamics can be described by a system of deterministic differential equations. For reward structures derived from [2 × 2] bimatrix games, fixed points are determined, and for the special case of the prisoner’s dilemma, the dynamics is analyzed in more detail on the assumptions that both agents start with the same or with different reaction probabilities.
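The interaction loop described in the abstract can be illustrated with a minimal simulation sketch. The abstract does not state the update rule, so the code below assumes a simple linear reward-inaction update; the payoff values, the function name `simulate`, the learning-rate parameter `alpha`, and the initial stimulus are likewise illustrative assumptions and not taken from the paper. Each agent holds two reaction probabilities (the probability of cooperating given the other agent's last move), and every move serves both as the stimulus for the other agent's next move and as the consequence of the other agent's previous move.

```python
import random

# Assumed prisoner's dilemma payoffs, scaled to [0, 1] (not from the paper).
# Index 0 = cooperate, 1 = defect; PAYOFF[own_move][other_response].
PAYOFF = [[0.6, 0.0],
          [1.0, 0.2]]


def simulate(steps, alpha, p_a, p_b, seed=0):
    """Alternate moves between two learners and return their reaction probabilities.

    probs[agent][stimulus] is the probability that `agent` cooperates
    after the other agent's last move was `stimulus`.
    """
    rng = random.Random(seed)
    probs = [list(p_a), list(p_b)]
    stimulus = 0      # assumption: the game starts as if the last move was "cooperate"
    pending = None    # (agent, stimulus, action) still waiting for its consequence
    mover = 0
    for _ in range(steps):
        # The mover responds to the stimulus using its reaction probability.
        action = 0 if rng.random() < probs[mover][stimulus] else 1
        if pending is not None:
            # This response is the consequence of the other agent's previous move.
            agent, s, a = pending
            reward = PAYOFF[a][action]
            p = probs[agent][s]
            if a == 0:
                # Reinforce cooperation in proportion to the reward.
                probs[agent][s] = p + alpha * reward * (1.0 - p)
            else:
                # Reinforce defection, i.e. lower the cooperation probability.
                probs[agent][s] = p - alpha * reward * p
        pending = (mover, stimulus, action)
        stimulus = action   # the response becomes the next stimulus
        mover = 1 - mover
    return probs


if __name__ == "__main__":
    # Small alpha loosely mimics the "slow learning" regime.
    print(simulate(steps=20000, alpha=0.01, p_a=(0.5, 0.5), p_b=(0.5, 0.5)))
```

Running the sketch with a small `alpha` and with equal or different starting probabilities for the two agents gives a rough feel for the two prisoner's dilemma scenarios the abstract mentions; it is not a substitute for the paper's differential-equation analysis.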


Similar resources

Which Types of Learning Make a Simple Game Complex?

The present study focuses on a class of games with reinforcement-learning agents that adaptively choose their actions to locally maximize their rewards. By analyzing a limit model with a special type of learning, previous studies suggested that dynamics of games with learners may become chaotic. We evaluated the generality of this model by analyzing the consistency of this limit model in compar...


A model for the evolution of reinforcement learning in fluctuating games

Many species are able to learn to associate behaviors with rewards as this gives fitness advantages in changing environments. Social interactions between population members may, however, require more cognitive abilities than simple trial-and-error learning, in particular the capacity to make accurate hypotheses about the material payoff consequences of alternative action combinations. It is unc...


The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems

Reinforcement learning can provide a robust and natural means for agents to learn how to coordinate their action choices in multiagent systems. We examine some of the factors that can influence the dynamics of the learning process in such a setting. We first distinguish reinforcement learners that are unaware of (or ignore) the presence of other agents from those that explicitly attempt to lear...


L2 Discourse Co-construction within the Learner’s ZPD

This study examined the effect of the ZPD-based discourse scaffolding on EFL learners' co-construction of L2 metadiscourse performing collaborative writing tasks and explored the discourse scaffolding dynamics. The participants were 160 EFL students that were assigned to four different treatment conditions: (i) formal teaching, (ii) input enhancement, (iii) non-ZPD interaction, and (iv) ZPD-bas...


Agent Based Virtual Tutorship and E-Learning Techniques Applied to a Business Game Built on System Dynamics

An advanced Business Game is presented in the paper, built on the methodology of System Dynamics. It can be used for cognitive learning and knowledge transmission in schools and Universities; it allows the learners to take decisions at each time step, after which it calculates the corresponding results, showing them according to the principles of double entry accounting. An agent based framewor...


Evolution of cooperation facilitated by reinforcement learning with adaptive aspiration levels.

Repeated interaction between individuals is the main mechanism for maintaining cooperation in social dilemma situations. Variants of tit-for-tat (repeating the previous action of the opponent) and the win-stay lose-shift strategy are known as strong competitors in iterated social dilemma games. On the other hand, real repeated interaction generally allows plasticity (i.e., learning) of individu...



Journal title:
  • CEJOR

Volume 14, Issue

Pages -

Publication date 2006